(54) Genetic engineering

(57) It has been a problem to find an 
alternative, less time-consuming, and 
more reliable source of factor IX, a 
polypeptide which is essential to the 
human blood-clotting process and 
necessary for the treatment of 
patients with Christmas disease. In 
order to aid in the solution of the 
problem, there is provided 
recombinant DNA containing a DNA 
sequence occurring in the human 
factor IX genome, including
recombinant DNA comprising
substantially the whole sequence of
the human factor IX genome, which is
inserted in a cloning vehicle and
transformed into a host, such as 
Escherichia coli. Other fragments of 
the sequence have also been cloned 
and the invention includes DNA 
molecules comprising part or all of the 
human factor IX DNA. There is also 
described cDNA derived from human 
factor IX RNA. Uses include the 
provision of an intermediate of value 
in the genetic engineering of a factor 
IX polypeptide precursor and thence 
manufacture of the factor IX 
polypeptide, and in making probes for 
use in diagnosing the presence of 
normal or abnormal factor IX DNA in 
patients with Christmas disease. 
1st amino acid             70                  75
sequence :                 Glu-Cys-Trp-Cys-Gln-Ala

mRNA :                  5' GA(A/G) UG(U/C) UGG UG(U/C) CA(A/G) GCN 3'

Deoxyoligonucleotides   3' CT(T/C) AC(A/G) ACC AC(A/G) GTT CG (oligo N2A)
synthesized :
                        3' CT(T/C) AC(A/G) ACC AC(A/G) GTC CG (oligo N2B)

2nd amino acid             348             352
sequence :                 His-Met-Phe-Cys-Ala

mRNA :                  5' CA(U/C) AUG UU(U/C) UG(U/C) GCN 3'

Deoxyoligonucleotides
synthesized :           3' GT(A/G) TAC AA(A/G) AC(A/G) CG (oligo N1)

2123499 







I 60 

|esnpclnggmckddinsy 
tgaatccaatccatgtttaaatggcggcatgtgcaaggatgacattaattcctat 

10 20 30 40 50 

70 80 90 

ECWCQAGFEGTNCELDATCSIK 
GAATGTYGGTGTCAAGC TGGATTTGAAGGAACGAACTGTGAATTAGATGCAACATGCAGCATTAA 
60 70 80 90 100 110 120 

100 

NGRCKQFCKRDTDNKVVC 
GAATGGCAGATGCAAGCAGTTTTGTAAAAGGGACACAGATAACAAGGTGGTTTGT 
130 140 150 160 170 

110 120 130 

SCTDGYRLAEDQKSCEPAVPFP 
TCCTGTACTGACGGATACCGACTTGCAGAAGACCAAAAGTCCTGCGAACCAGCAGTGCCATTTCC 
180 190 200 210 220 230 240 



140 150

CGRVSVSHVRPRFHGLCSC*E
CTGTGGACGAGTCTCTGTCTCACAT3TGAGGCCCCGCTTTCACGGTCTGTGTTCGTGC7GAGAA
250 260 270 280 290 300
FIG. 8(a)
FIG. 8(b)
FIG. 8(c)



Position  Fraction  Enzyme   Site
1855      0.156     DDE1     CTTAG
1884      0.159     MBO11    TCTTC
1901      0.160     AVA11    GGACC
1901      0.160     SAU961   GGACC
1939      0.163     MNL1     CCTC
1940      0.163     DDE1     CTCAG
1947      0.164     ALU1     AGCT
1965      0.165     HAE111   GGCC
1965      0.165     SAU961   GGCCC
2030      0.171     RSA1     GTAC
2081      0.175     RSA1     GTAC
2097      0.177     HGA1     GACGC
2110      0.178     ALU1     AGCT
2112      0.178     DDE1     CTCAG
2116      0.178     RSA1     GTAC
2128      0.179     MBO1     GATC
2141      0.180     MNL1     CCTC
2147      0.181     MNL1     CCTC
2150      0.181     FOK1     CATCC
2158      0.182     MNL1     CCTC
2161      0.182     MNL1     CCTC
2165      0.182     MNL1     CCTC
2171      0.183     ACC1     [illegible]
2174      0.183     HINF1    GACTC
2222      0.187     DDE1     CTTAG
2225      0.187     ALU1     AGCT
2248      0.189     PST1     CTGCAG
2282      0.192     MST11    CCTAAGG
2283      0.192     DDE1     CTAAG
2287      0.193     FOK1     GGATG
2296      0.193     MNL1     CCTC
2301      0.194     ALU1     AGCT
2349      0.198     BBV1     GCTGC
2349      0.198     FNU4H1   GCTGC
2422      0.204     HINF1    GATTC
2468      0.208     HINF1    GATTC
2483      0.209     BSTE11   GGTAACC
2503      0.211     ALU1     AGCT
2524      0.212     XBA1     TCTAGA
2534      0.213     DDE1     CTAAG






FIG. 8(d) 
FIG. 8(e)
FIG. 8(f)
FIG. 8(g)
FIG. 8(h)
FIG. 8(i)
FIG. 8(j)
FIG. 8(k)
FIG. 8(l)
             BamHI  PvuII  HindIII

Oligo N3  5' GATCCAGCTGA 3'

Oligo N4      3' GTCGACTTCGA 5'

Fig. 10



EcoRI  ClaI  HindIII  PvuII  BamHI

               10         20         30         40         50
5' GAA TTCTCATGTT TGACAGCTTA TCATCGATAA GCTTCAGCTG GATCCTCTAC
               60
       GCCGGACGCA 3'
SPECIFICATION
Genetic engineering 



BACKGROUND OF THE INVENTION

1. Field of the invention

This invention is in the field of genetic
engineering relating to factor IX DNA.

2. Description of prior art

Factor IX (Christmas factor or antihaemophilic
factor B) is the zymogen of a serine protease
which is required for blood coagulation via the
intrinsic pathway of clotting (Jackson &
Nemerson, Ann.Rev.Biochem. 49, 765-811,
1980). This factor is synthesised in the liver and
requires vitamin K for its biosynthesis (Di Scipio &
Davie, Biochem. 18, 899-904, 1979).

Human factor IX has been purified and
characterised, but details of the amino acid
sequence are fragmentary. It is a single-chain
glycoprotein, with a molecular weight of
approximately 60,000 (Suomela, Eur.J.Biochem.
71, 145-154, 1976). Like other vitamin K-dependent
plasma proteins, human factor IX
contains in the amino-terminal region
approximately 12 gamma-carboxyglutamic acid
residues (Di Scipio & Davie, Biochem. 18,
899-904, 1979).

During the clotting process, and in the presence
of Ca++ ions, factor IX is acted upon by activated
factor XI (XIa) by the cleavage of two internal
peptide bonds, releasing an activation
glycopeptide of 10,000 daltons (Di Scipio et al.,
J.Clin.Invest. 61, 1528-1538, 1978). The
activated factor IX (IXa) is composed of two
chains held together by at least one disulphide
bond. Factor IXa then participates in the next step
in the coagulation cascade by acting on factor X in
the presence of activated factor VIII, Ca++ ions,
and phospholipids (Lindquist et al., J.Biol.Chem.
253, 1902-1909, 1978).

Individuals deficient in factor IX (Christmas
disease or haemophilia B) show bleeding
symptoms which persist throughout life. Bleeding
may occur spontaneously or following injury, and
may take place virtually anywhere. Bleeding into
the joints is common and, after repeated
haemorrhages, may result in permanent and
crippling deformities. The condition is a sex-linked
disorder affecting males. Its frequency in the
population is approximately 1 in 30,000 males.

The current method of diagnosing Christmas
disease involves measurement of the titre of factor
IX in plasma by a combination of a clotting assay
and an immunochemical assay. Treatment of
haemorrhage in the disease consists of factor IX
replacement by means of intravenous transfusion
of human plasma protein concentrates enriched in
factor IX. The enrichment of plasma in factor IX is
a time-consuming process.

Summary of the invention

After considerable research and experiment,
important progress has now been made towards
producing artificial human factor IX by
recombinant DNA technology (genetic
engineering). Thus, the cloning of DNA sequences
which are substantially the same as extensive
sequences occurring in the human factor IX
genome has been achieved.

The invention arises from the finding that an
extensive DNA sequence of the human factor IX
genome can be obtained by a clever and laborious
combination of chemical synthesis and artificial
biosynthesis, starting from elementary nucleotide
or dinucleotide "building blocks", as will be
described below.

A major feature of the invention comprises
recombinant DNA which comprises a cloning
vehicle DNA sequence and a sequence foreign
thereto (i.e. foreign to the vehicle) which is
substantially the same as a sequence occurring in
the human factor IX genome. An 11,873 nucleotide
long part of such a foreign sequence has been
identified and a very large part of it has been
sequenced by the Maxam-Gilbert sequencing
method. A 129 nucleotide length of this sequence
is more than sufficient to characterise it
unambiguously as coding for a specific protein, and
a particular such length is regarded herein as
useful to characterise the whole sequence
inserted in the cloning vehicle as one occurring in
the human factor IX genome. Other cloned
sequences can then be verified as belonging to the
human factor IX genome by determining that part
thereof is identical to a region of the first-mentioned
sequence, i.e. the sequences have a
common identity in an overlapping region.

A further feature of the invention therefore
comprises recombinant DNA which comprises a
cloning vehicle or vector DNA sequence and a
DNA sequence foreign thereto which consists of
or includes substantially the following sequence of
129 nucleotides (which should be read in rows of
30 across the page):

ATGTAACATG TAACATTAAG AATGGCAGAT
GCGAGCAGTT TTGTAAAAAT AGTGCTGATA
ACAAGGTGGT TTGCTCCTGT ACTGAGGGAT
ATCGACTTGC AGAAAACCAG AAGTCCTGTG
AACCAGCAG (1)



The invention includes particularly recombinant
DNA which comprises a cloning vehicle DNA
sequence and a sequence foreign to the cloning
vehicle, wherein the foreign sequence includes
substantially the whole of an exon sequence of the
human factor IX genome. The 129-nucleotide
sequence described above corresponds
substantially to such an exon sequence. Another
such exon sequence which independently
characterises the human factor IX DNA is the
203-nucleotide sequence substantially as follows
(again reading in rows of 30 across the page):
TGCCATTTCC ATGTGGAAGA GTTTCTGTTT

CACAAACTTC TAAGCTCACC CGTGCTGAGG 

CTGTTTTTCC TGATGTGGAC TATGTAAATT 

CTACTGAAGC TGAAACCATT TTGGATAACA 

TCACTCAAAG CACCCAATCA TTTAATGACT

TCACTCGGGT TGTTGGTGGA GAAGATGCCA 

AACCAGGTCA ATTCCCTTGG CAG 
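The stated lengths of the two characterising exon sequences can be checked mechanically. The following is only an illustrative sketch in Python, not part of the original specification; the helper names `clean` and `check_length` are hypothetical, and the sequences are copied from the rows as printed in this text.

```python
# Illustrative check: the two exon sequences printed in the text should
# contain 129 and 203 nucleotides respectively.

SEQ_129 = """
ATGTAACATG TAACATTAAG AATGGCAGAT
GCGAGCAGTT TTGTAAAAAT AGTGCTGATA
ACAAGGTGGT TTGCTCCTGT ACTGAGGGAT
ATCGACTTGC AGAAAACCAG AAGTCCTGTG
AACCAGCAG
"""

SEQ_203 = """
TGCCATTTCC ATGTGGAAGA GTTTCTGTTT
CACAAACTTC TAAGCTCACC CGTGCTGAGG
CTGTTTTTCC TGATGTGGAC TATGTAAATT
CTACTGAAGC TGAAACCATT TTGGATAACA
TCACTCAAAG CACCCAATCA TTTAATGACT
TCACTCGGGT TGTTGGTGGA GAAGATGCCA
AACCAGGTCA ATTCCCTTGG CAG
"""


def clean(block: str) -> str:
    """Remove the spacing and line breaks used for display."""
    return "".join(block.split()).upper()


def check_length(block: str, expected: int) -> bool:
    """Return True if the block is pure A/C/G/T and has the expected length."""
    seq = clean(block)
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"non-nucleotide characters found: {sorted(unexpected)}")
    return len(seq) == expected


if __name__ == "__main__":
    print(check_length(SEQ_129, 129), check_length(SEQ_203, 203))
```

A check of this kind is also a convenient guard against transcription errors when the printed rows are retyped.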

The intron sequences of the human factor IX
genome are excised (spliced out of the primary
transcript) during the process by which mRNA is
made in human cells.
Only exon sequences are translated into protein.
DNA coding for factor IX has been prepared from
human mRNA. This cDNA has been partly
sequenced and found to contain the same 129-
and 203-nucleotide sequences set out above.

The invention also includes recombinant DNA
which comprises a cloning vehicle sequence and a
DNA sequence foreign to the cloning vehicle,
wherein the foreign sequence comprises a DNA
sequence which is complementary to human
factor IX mRNA. Such a recombinant cDNA can be
isolated from a library of recombinant cDNA
clones derived from human liver mRNA by using
an exon of the genomic human factor IX DNA (or
part thereof) as a probe to screen this library and
thence isolating the resulting clones.

The invention also includes recombinant DNA
in which the foreign sequence is any fragment of
human factor IX DNA, particularly of length at
least 50 and preferably at least 75 nucleotides or
base-pairs. It includes such recombinant DNA
whether or not part of the 129- or 203-base-pair
sequence defined above. It includes especially part
or all of the exon sequences of human factor IX
genomic DNA. Various short lengths up to about
11 kilobases (11,000 nucleotides or base-pairs)
long have been prepared by use of various
restriction endonucleases. Methods of isolating
recombinant DNA from clones are well known and
some are described hereinafter. The DNA of the
invention can be in single- or double-stranded form.

The recombinant human factor IX DNA of this
invention is useful as a tool of recombinant DNA
technology. Thus it is useful as the first stage in
the production of artificial human factor IX and in
the preparation of probes for diagnostic purposes.

In the production of the artificial human factor
IX it is contemplated that appropriate cDNA or
genomic clones will be introduced into a suitable
expression vector in either mammalian or bacterial
systems. For mammalian studies, the gene might
be too long to be conveniently retained in one
clone. Therefore a suitable artificial "minigene"
will be designed and constructed from suitable
parts of the cDNA and genomic clones. The
minigene will be under the control of its own
promoter, or that promoter will instead be replaced
by an artificial one, perhaps the mouse
metallothioneine I promoter. The resultant
"minigene" will then be
introduced into mammalian tissue culture cells,
e.g. a hepatoma cell line, and selection for clones
of cells synthesising maximum amounts of
biologically active factor IX will be carried out.
Alternatively "genetic farming" could be employed
as has been demonstrated for mouse growth
hormone (Palmiter et al., Nature 300, 611-615,
1982). The minigene would be micro-injected into
the pronucleus of fertilised eggs, followed by in
vivo cloning and selection for progeny producing
the largest quantity of human factor IX in blood.
Alternatively, it is contemplated that the cDNA
clone or selected parts of it will be linked to a
suitable strong bacterial promoter, e.g. a Lac or
Trp promoter or the lambda PR or PL promoter,
and a factor IX polypeptide obtained therefrom.

The natural factor IX polypeptide is synthesised
as a precursor containing both a signal and
propeptide region. They are both normally cleaved
off in the production of the definitive length
protein. Even this product is merely a precursor. It
is biologically inactive and must be
gamma-carboxylated at 12 specific N-terminal glutamic
acid residues in the so-called "GLA" domain by the
action of a specific vitamin K-dependent
carboxylase. In addition, two carbohydrate
molecules are added to the connecting peptide
region of the molecule, but it remains unknown
whether they are required for activity. The
substrate for the carboxylase is unknown and
could be the precursor factor IX polypeptide or
alternatively the definitive length protein.
Therefore various relevant polypeptides both with
and without the precursor domains will be
"constructed" using genetic engineering methods
in bacterial hosts. They will then be tested as
substrates for the conversion of inactive to
biologically active factor IX in vitro by the action of
partially purified preparations of the carboxylase
enzyme which can be isolated from liver
microsomes or other suitable sources.

For diagnostic purposes, the recombinant
human genomic factor IX DNA or recombinant
human mRNA-derived factor IX DNA has a wide
variety of uses. It can be cleaved by enzymes or
combinations of two or more enzymes into shorter
fragments of DNA which can be recombined into
the cloning vehicle, producing "sub-clones".
These sub-clones can themselves be cleaved by
restriction enzymes to DNA molecules suitable for
preparing probes. A probe DNA (by definition) is
labelled in some way, conveniently radiolabelled,
and can be used to examine in detail mutations in
the human DNA which ordinarily would produce
factor IX. Several different probes have been
produced for examining several different regions
of the genome where mutation was suspected to
have occurred in patients. Failure to obtain
hybridisation from such a probe indicates that the
sequence of the probe differs in the patient's DNA.
In particular it has been shown that Christmas
disease can be detected or confirmed by such
methodology. Useful probes can contain intron
and/or exon regions of the genomic DNA or can
contain cDNA derived from the mRNA.

The invention includes particularly probe DNA,
i.e. DNA which is labelled, and of a length suitable for
the probing use envisaged. It can be
single-stranded or double-stranded over at least the
human factor IX DNA probing sequences thereof
and such sequences will usually have a length of
at least 15 nucleotides, preferably at least 19-30
nucleotides, in order to have a reasonable
probability of being unique. They will not usually be
longer than 5 kb and rarely longer than 10 kb.
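
The reasoning behind these minimum lengths can be illustrated with a rough estimate. The sketch below is not from the original specification: it assumes a haploid human genome of about 3 x 10^9 base-pairs, equal and independent base frequencies, and that either strand may hybridise. The function name `expected_chance_matches` and its parameters are hypothetical.

```python
# Rough estimate of how often a given n-nucleotide sequence would be
# expected to occur by chance in a genome of random sequence.

GENOME_BP = 3_000_000_000  # assumed approximate haploid human genome size


def expected_chance_matches(n: int, genome_bp: int = GENOME_BP, strands: int = 2) -> float:
    """Expected number of chance occurrences of one specific n-mer.

    Assumes the four bases are equally likely and independent, which is
    only an approximation for real genomic DNA.
    """
    return strands * genome_bp / 4 ** n


if __name__ == "__main__":
    for n in (12, 15, 17, 19, 30):
        print(f"{n:2d} nucleotides: {expected_chance_matches(n):.3g} expected chance matches")
```

Under these assumptions a 15-nucleotide sequence would still be expected to occur by chance a few times, while one of 19 nucleotides or more would be expected far less than once, which is consistent with the preference stated above.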

The invention accordingly includes a DNA
molecule, comprising part of the human factor IX
DNA sequence, whether or not labelled, whether
intron or exon or partly both. It also includes
human cDNA corresponding to part or all of
human factor IX mRNA. It includes particularly a
solution of any DNA of the invention, which is a
form in which it is conveniently obtainable by
electroelution from a gel.

The invention includes, of course, a host
transformed with any of the recombinant DNA of
the invention. The host can be a bacterium, for
example an appropriate strain of E.coli, chosen
according to the nature of the cloning vehicle
employed. Useful hosts may include strains of
Pseudomonas, Bacillus subtilis and Bacillus
stearothermophilus, other Bacilli, yeasts and other
fungi and mammalian (including human) cells.

One process practised in connection with this
invention for preparing a host transformed with
the recombinant DNA of the invention is based on
the following steps:

(1) synthesising an oligodeoxynucleotide
having a nucleotide sequence comprising that
occurring in bovine factor IX messenger RNA
coding for amino acids 70-75 or 348-352 of
bovine factor IX, and labelling the
oligodeoxynucleotide to form a probe;

(2) preparing complementary DNA to a mixture
of bovine mRNAs;

(3) inserting the complementary DNA in a
cloning vector to form a mixture of recombinant
bovine cDNAs;

(4) transforming a host with said mixture of
recombinant bovine cDNAs to form a library of
clones and multiplying said clones;

(5) probing the clones with the synthetic
oligodeoxynucleotide probe obtained in step 1 and
isolating the resultant recombinant bovine factor
IX cDNA-containing clone;

(6) digesting the recombinant bovine factor IX
cDNA from said clone with one or more enzymes
to produce a bovine factor IX cDNA molecule
comprising a shorter sequence of bovine factor IX
DNA, but preferably at least 50 base-pairs long;
and

(7) probing a library of recombinant human
genomic DNA in a transformed host with the
shorter sequence bovine factor IX cDNA molecule,
to hybridise the human genomic DNA to the said
recombinant bovine factor IX DNA and isolating
the resultant recombinant DNA-transformed host.



Brief description of the drawings 
Figure 1 shows the structure of a published
amino-acid sequence of bovine factor IX
polypeptide, the deduced sequence of the mRNA
from which it would be translated and the
structures of oligonucleotides (oligo N1 and N2)
synthesised in the course of this invention;

Figures 2 and 3 show the chemical formulae of
"building blocks" used to synthesise the
oligonucleotides referred to in Figures 1 and 11;

Figure 4 is an elevational view, partly sectioned,
showing an apparatus for synthesising
oligonucleotides;

Figure 5 shows the sequence of part of the
bovine factor IX cDNA obtained in this invention;

Figure 6 is a map showing the organisation of
an approximately 27 kb length of human factor IX
genomic DNA and is divided into five portions,
showing:

(a) the exon regions;

(b) the 11,873-nucleotide length sequenced;

(c) cDNA molecules obtained by restriction with
various endonucleases, sub-cloned and
subsequently used as probes;

(d) DNA molecules obtained by restriction with
various endonucleases; and

(e) three regions of human factor IX genomic
DNA derived from three clones in lambda phage
vector.

Figure 7 shows the sequence of the DNA of
Figure 6(b) and in parts the encoded protein;

Figure 8 shows a restriction enzyme chart of
the sequence shown in Figure 7;

Figure 9 shows part of the sequence of the
human factor IX cDNA and its encoded protein;

Figure 10 shows the structure of a pair of
complementary oligonucleotides (oligo N3 and
N4) synthesised in the course of this invention;

Figure 11 shows part of the DNA sequence of
the vector pAT153/PvuII/8 of this invention, in the
region where it differs from pAT153;

Figure 12 is a diagram of plasmid pHIX17 of
the invention showing the origin of the 1.4 kb
fragment used for probing and initial sequencing;
and

Figure 13 shows the position of the major
radioactive bands on probing a "Southern blot" of
normal human DNA, cut by the restriction
enzymes EcoRI (E), HindIII (H), BglII (B) and BclI (Bc),
with a sub-clone of the recombinant human factor
IX DNA of this invention.

DESCRIPTION OF PREFERRED EMBODIMENTS

1. General description

A recombinant DNA of the invention can be
extracted by means of probes from a library of
cloned human genomic DNA. This is a known
recombinant library and the invention does not, of
course, extend to human genomic factor IX DNA
when present in such a library. The probes used
were of bovine factor IX cDNA (DNA
complementary to bovine mRNA), which were
prepared by an elaborate process involving firstly
the preparation of recombinant bovine cDNA from
a bovine mRNA starting material, secondly the
chemical synthesis of oligonucleotides, thirdly
their use to probe the recombinant bovine cDNA,
in order to extract bovine factor IX cDNA, and
fourthly the preparation of suitable probes of
shorter length from the recombinant bovine factor
IX cDNA. The first probe tried appeared to contain
an irrelevant sequence and the second probe tried,
not containing it, proved successful in enabling a
single clone of the human genomic factor IX DNA
to be isolated. This clone is designated lambda
HIX-1. The steps involved are described in more
detail in the sub-section "Examples" appearing
hereinafter, and the second probe comprises the
247 base-pair DNA sequence of bovine factor IX
cDNA indicated in Figure 5 of the drawings. The
invention therefore provides specifically a
recombinant DNA which comprises a cloning
vehicle sequence and a DNA sequence foreign to
the cloning vehicle, which recombinant DNA
hybridises to a 247 base-pair sequence of bovine
factor IX cDNA indicated in Figure 5 (by the
arrows at each end thereof).

The cloning vehicle or vector employed in the
invention can be any of those known in the
genetic engineering art (but will be chosen to be
compatible with the host). They include E.coli
plasmids, e.g. pBR322, pAT153 and modifications
thereof, plasmids with wider host ranges, e.g. RP4,
plasmids specific to other bacterial hosts, phages,
especially lambda phage, and cosmids. A cosmid is
a cloning vehicle containing a fragment of phage
DNA including its cos (cohesive-end site) inserted
in a plasmid. The resultant recombinant DNA is
circular and has the capacity to accommodate
very large fragments of additional foreign DNA.

Fragments of human factor IX genomic DNA
can be prepared by digesting the cloned DNA with
various restriction enzymes. If desired, the
fragments can be religated to a cloning vehicle to
prepare further recombinant DNA and thereby
obtain "sub-clones". In connection with this
embodiment a new cloning vehicle has been
prepared. This is a modified pAT153 plasmid
prepared by ligating a BamHI and HindIII double
digest of pAT153 to a pair of complementary
double sticky-ended oligonucleotides having a
DNA sequence providing a BamHI restriction
residue at one end, a HindIII restriction residue at
the other end and a PvuII restriction site in
between. 
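
The design of such a pair of double sticky-ended oligonucleotides can be checked with a short script. The following is a hedged sketch rather than anything taken from the specification: the oligo N3 and N4 sequences and the vector region are as printed in Figures 10 and 11, the recognition sites (GGATCC, CAGCTG, AAGCTT) are the standard ones for BamHI, PvuII and HindIII, and the function names are hypothetical.

```python
# Check that oligos N3 and N4 anneal to give BamHI- and HindIII-compatible
# 5' overhangs with a PvuII site in the duplex region.

N3 = "GATCCAGCTGA"          # printed 5' -> 3'
N4_3TO5 = "GTCGACTTCGA"     # printed 3' -> 5' in Figure 10
FIG11 = ("GAATTCTCATGTTTGACAGCTTATCATCGATAAGCTTCAGCTG"
         "GATCCTCTACGCCGGACGCA")  # vector region as printed in Figure 11

SITES = {"BamHI": "GGATCC", "PvuII": "CAGCTG", "HindIII": "AAGCTT"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' -> 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def anneal(top: str, bottom_3to5: str, overhang: int = 4):
    """Return (top overhang, bottom overhang, duplex) for two oligos that
    pair everywhere except for a 5' overhang on each strand."""
    bottom = bottom_3to5[::-1]                      # rewrite 5' -> 3'
    duplex = top[overhang:]
    if duplex != revcomp(bottom[overhang:]):
        raise ValueError("oligos are not complementary in the duplex region")
    return top[:overhang], bottom[:overhang], duplex


def ligated_junction(top: str) -> str:
    """Top strand after ligation into BamHI/HindIII-cut vector, with the
    flanking vector bases (G...AGCTT) that those cuts leave behind."""
    return "G" + top + "AGCTT"


if __name__ == "__main__":
    print(anneal(N3, N4_3TO5))
    junction = ligated_junction(N3)
    print({name: site in junction for name, site in SITES.items()})
    print(revcomp(junction) in FIG11)
```

The last check compares the ligated junction, read on the opposite strand, with the region of the modified vector printed in Figure 11.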

While the invention is described herein with
reference to human genomic factor IX DNA in
particular, the invention includes human factor IX
cDNA (complementary to human factor IX mRNA)
which contains substantially the same sequences.
A library of human cDNA has been prepared and
probed with human factor IX genomic DNA to
isolate human factor IX cDNA from the library. For
this purpose the probe DNA is conveniently of
relatively short length and must include at least
one exon sequence. The invention therefore
includes a process of preparing a host transformed
with recombinant DNA, comprising cloning vector
sequences and a sequence of nucleotides
comprised in cDNA complementary to human
factor IX mRNA, which process comprises probing
a library of clones containing recombinant DNA
complementary to human mRNA with a probe
comprising a labelled DNA comprising a sequence
complementary to part or all of an exon region of
the human factor IX genome.

2. Examples

A. Bacteria used

E.coli K-12 strain MC 1061 (Casadaban &
Cohen, J.Mol.Biol. 138, 179-207, 1980), E.coli
K-12 strain HB 101 (Boyer & Roulland-Dussoix,
J.Mol.Biol. 41, 459-472, 1969) and E.coli K-12
strain K803, which is a known strain used by
genetic engineers.

B. Source and purification of bovine factor IX,
anti-bovine factor IX antibody, and bovine mRNA

Highly purified bovine factor IX and rabbit
anti-bovine factor IX antiserum were gifts from Dr. M.
P. Esnouf. Analysis of the purified bovine factor IX
on a denaturing polyacrylamide gel showed that
it had a purity of greater than 99%. Specific
anti-factor IX immunoglobulins used for
immunoprecipitation experiments were purified as
described by Choo et al., Biochem.J. 199,
527-535, 1981, by passage of the crude
antiserum through a Sepharose-4B column onto
which pure bovine factor IX had been coupled.

Bovine mRNA was obtained from calf liver and
isolated by the guanidine hydrochloride method
(Chirgwin et al., Biochem. 18, 5294-5299,
1979). The mRNA preparation was passed
through an oligo dT-cellulose column (Caton and
Robertson, Nucl.Acids Res. 7, 1445-1456,
1979) to isolate poly(A)+ mRNA.

Poly(A)+ mRNA was translated in a rabbit
reticulocyte cell-free system in the presence of
35S-cysteine as described by Pelham and Jackson
(Eur.J.Biochem. 67, 247-256, 1976). At the
end of the translation reaction, factor IX
polypeptide was precipitated by the addition of
specific anti-factor IX immunoglobulins. The
immunoprecipitation procedure was as described
by Choo et al., Biochem.J. 181, 285-294, 1979.
The immunoprecipitated material was washed
thoroughly and resolved on a two-dimensional
SDS-polyacrylamide gel (Choo et al., Biochem.J.
181, 285-294, 1979), by isoelectric focussing in
one dimension and electrophoresis in another.
Some polypeptides of known molecular weight
were subjected to this procedure, to serve as
reference points. The immunoprecipitated material
showed 4 pronounced spots, all in the 50,000
molecular weight region and with separated
isoelectric points. These predominant spots of
molecular weight about 50,000 represent a single
polypeptide chain plus a possible prepeptide
signal sequence, a deduction compatible with
published data (Katayama et al., Proc.Natl.Acad.
Sci.USA 76, 4990-4994, 1979).

When the gel analysis was repeated for the
same material but immunoprecipitated in the
presence of unlabelled pure bovine factor IX, the 4
spots appeared at reduced intensity, indicating
that the translation product is specifically
competed for by pure factor IX. Thirdly,
immunoprecipitation was performed using a
control rabbit antiserum, i.e. from a rabbit which
had not been immunised with factor IX. None of
the 4 spots appeared. These results therefore
indicate that the translation product was a factor
IX polypeptide.

The specific immunological/cell-free translation
assay established above was used to monitor the
enrichment of factor IX mRNA on sucrose gradient
centrifugations. Total poly(A)+ mRNA was
resolved by two successive separations by sucrose
gradient centrifugation. When individual fractions
from the gradient were assayed by the above
method, the region of size 20-22 Svedberg units
(approx. 2.5 kilobases of RNA) was found
to be enriched (approx. ten-fold) for the bovine
factor IX mRNA. This enriched fraction was used
in the subsequent cloning experiments.

C. Synthesis of specific bovine factor IX
deoxyoligonucleotide mixtures

Starting from a knowledge of the amino acid
sequence of bovine factor IX (Katayama et al.,
Proc.Natl.Acad.Sci.USA 76, 4990-4994,
1979), the synthesis of two mixtures of
oligonucleotide probes was designed. These
probes consisted of DNA sequences coding for
two different regions of the protein. The regions
selected were those known to differ in sequence in
the analogous serine proteases, prothrombin,
Factor C and Factors VII and X and were those
corresponding to amino acids 70-75 and
348-352 respectively. The 70-75 region was
particularly favourable in that the mixture of
oligonucleotides synthesised, i.e. oligo N2A and
oligo N2B, contained all 16 possible sequences
that might occur in a 17 nucleotide long region of
the mRNA corresponding to amino acids 70-75.
The oligo N2A-N2B mixture is hereinafter called
"oligo N2" for brevity.

Figure 1 of the drawings shows the two 
selected regions of the known amino acid 
sequence of bovine factor IX, the corresponding 
mRNA and the oligonucleotides synthesised. 

Since some of the amino acids are coded for by
more than one nucleotide triplet, there are 4
ambiguities in the mRNA sequence shown for
amino acids 70-75 and therefore 16 possible
individual sequences. 
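
The count of 16 can be reproduced by enumerating the codon choices. The sketch below is illustrative only and is not part of the original specification: the codon table is the standard genetic code, the 17-nucleotide window stops after the second base of the alanine codon, and the assignment of sequences to oligo N2A or N2B by the glutamine codon follows the oligonucleotides as printed in Figure 1. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code entries for the amino acids 70-75 of bovine factor IX.
CODONS = {
    "Glu": ["GAA", "GAG"],
    "Cys": ["UGU", "UGC"],
    "Trp": ["UGG"],
    "Gln": ["CAA", "CAG"],
    "Ala": ["GCU", "GCC", "GCA", "GCG"],
}
PEPTIDE_70_75 = ["Glu", "Cys", "Trp", "Cys", "Gln", "Ala"]


def mrna_variants(peptide, length):
    """All distinct mRNA sequences of the given length that could encode
    the start of the peptide."""
    choices = (CODONS[aa] for aa in peptide)
    return sorted({"".join(codons)[:length] for codons in product(*choices)})


def complement_3to5(mrna: str) -> str:
    """Base-by-base DNA complement, written 3' -> 5' under the mRNA as in
    Figure 1."""
    pairs = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in mrna)


def split_pools(variants):
    """Group the complementary oligos by their 3'->5' ending: ...GTTCG
    (oligo N2A) or ...GTCCG (oligo N2B)."""
    oligos = [complement_3to5(v) for v in variants]
    n2a = [o for o in oligos if o.endswith("GTTCG")]
    n2b = [o for o in oligos if o.endswith("GTCCG")]
    return n2a, n2b


if __name__ == "__main__":
    variants = mrna_variants(PEPTIDE_70_75, 17)
    n2a, n2b = split_pools(variants)
    print(len(variants), len(n2a), len(n2b))
```

Truncating at 17 nucleotides removes the fourfold ambiguity of the third alanine base, which is why only the four twofold ambiguities contribute to the total.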

The nucleotide mixtures oligo N1 and oligo N2
were synthesized using the solid phase
phosphotriester method of Duckworth et al.,
Nucl.Acids Res. 9, 1691-1706, 1981, modified
in two ways. Firstly, o-chlorophenyl rather than
p-chlorophenyl blocking groups were used for the
phosphotriester grouping, and were incorporated
in the mononucleotide and dinucleotide "building
blocks". Figures 2 and 3 of the drawings show (a)
dinucleotide and (b) mononucleotide "building
blocks". DMT = 4,4'-dimethoxytrityl and B =
6-N-benzoyladenin-9-yl, 4-N-benzoylcytosin-1-yl,
2-N-isobutyrylguanin-9-yl or thymin-1-yl,
depending on the nucleotide selected. Secondly,
the "reaction cell" used for the successive
addition of mono- or dinucleotide "building
blocks" was miniaturised so that the coupling step
with the condensing agent
1-(mesitylene-2-sulphonyl)-3-nitro-1,2,4-triazole (MSNT) was
carried out in a volume of 0.5 ml pyridine
containing 3.5 micromoles of
polydimethylacrylamide resin, 17.5 micromoles of
incoming dinucleotide (or 35 micromoles of
mononucleotide) and 210 micromoles of MSNT.

Figure 4 of the drawings is an elevational view
of the microreaction cell 1 and stopper 2 used for
oligonucleotide synthesis, drawn 70% of actual
size. The device comprises a glass-to-PTFE tubing
joint 3 at the inlet end of the stopper 2. The
stopper has an internal conduit 4 which at its
lower end passes into a hollow tapered ground
glass male member 5 and thence into a sintered
glass outlet 6 to the stopper. The cell 1 has a
ground glass female member 7 complementary to
the member 5 of the stopper, leading to reaction
chamber 8, the lower end of which terminates in a
sintered glass outlet 9. This communicates with
glass tubing 10 and a 1.2 mm "Interflow" tap 11.
Further glass tubing 10, beyond the tap 11, leads
to the outlet glass-to-PTFE tubing joint 12. Pairs
of ears 13 on the stopper and cell enable them to
be joined together by springs (not shown) in a
liquid-tight manner.

After completion of the synthesis and
deprotection, fractionation was carried out by high
pressure liquid chromatography (Duckworth et al.,
see above) and the peak tubes corresponding to
the product of correct chain length were located
by labelling of fractions at their 5'-hydroxyl ends
using [gamma-32P]-ATP and T4 polynucleotide
kinase, followed by 20% 7M urea polyacrylamide
gel electrophoresis. The position on the gel of the
17- and 14-nucleotide oligonucleotides was
determined by separately labelling, by the method
described above, 11- and 14-nucleotide long
"marker" oligonucleotides and subjecting these to
the same gel electrophoresis.

D. Preparation of libraries of cDNA sequences for 
bovine mRNA 

Two different approaches were used for the
generation of cloned cDNA libraries:

(i) MboI library

First strand cDNA was synthesised using the
sucrose gradient-enriched poly(A)+ bovine mRNA
as template. The conditions used were as described
by Huddleston & Brownlee, Nucl. Acids Res. 10,
1029-1038, 1981, except that 2 micrograms of
oligo N1, 20-30 micrograms of the mRNA,
10 microcuries [alpha-32P]-dATP (Amersham,
3000 Ci/mmole), and 50 U of reverse transcriptase
were used in a 50 microlitre reaction. "dNTP" in
Figure 1 denotes the mixture of 4 deoxynucleoside
triphosphates required for synthesis. Oligo N1
hybridises to the corresponding region on the mRNA
(refer to Figure 1) and thereby acts as a primer
for the initiation of reverse transcription. It was
used in order to achieve a further enrichment for
factor IX mRNA. At the end of the cDNA synthesis
reaction, the cDNA was extracted with phenol and
desalted on a Sephadex G-100 column, before it was
treated with alkali (0.1 M NaOH, 1 mM EDTA) for
30 min. at 60°C to remove the mRNA strand. Second
strand DNA synthesis was then carried out exactly
as published (Huddleston & Brownlee, Nucl. Acids
Res. 10, 1029-1038, 1981).

The double-stranded DNA was next cleaved
with the restriction enzyme MboI and ligated to
the plasmid vector pBR322 which had been cut
with BamHI and treated with calf intestinal
alkaline phosphatase to minimise vector
self-religation. Phosphatase treatment was carried
out by incubating 5 micrograms of BamHI-cut
pBR322 plasmid with 0.5 microgram calf intestinal
phosphatase (Boehringer; in 10 mM Tris-HCl
buffer, pH 8.0) in a volume of 50 microlitres at
37°C for 10 minutes, see Huddleston & Brownlee
supra.

The ligated DNA was used to transform E.coli
strain MC 1061. For transformation E.coli MC 1061
was grown to early exponential phase as
indicated by an absorbancy of 0.2 at 600 nm and
made "competent" by treating the pelleted
bacterial cells first with one half volume, followed
by repelleting, and then with 1/50 volume of the
original growth medium of 100 mM CaCl2, 15% v/v
glycerol and 10 mM PIPES-NaOH, pH 6.6 at 0°C.
Cells were immediately frozen in a dry ice/ethanol
bath to -70°C. For transformation, 200 microlitre
aliquots were mixed with 10 microlitres of the
recombinant DNA and incubated at 0°C for 10
minutes followed by 37°C for 5 minutes. 200
microlitres of L-broth (bactotryptone 10 g, yeast
extract 5 g, sodium chloride 10 g, made up to 1
litre with deionised water) were then added and
incubation continued for a further 30 minutes at
37°C. The solution was then plated on the
appropriate antibiotic agar (see below). A library of
about 7,000 ampicillin-resistant colonies was thus
obtained. They were ampicillin-resistant because
they contained the beta-lactamase gene of pBR322.
Of these, approx. 85% were found to be
tetracycline-sensitive.

(ii) dC/dG tailed library

In the preparation of this library, first strand
cDNA was synthesised as described for the above
library except that oligo dT(12-18) was used as a
primer to initiate cDNA synthesis. Following this,
the cDNA was tailed with dCTP using terminal
transferase and back-copied with the aid of oligo
dG(12-18) primer and reverse transcriptase to give
double stranded DNA, exactly according to the
method of Land et al., Nucl. Acids Res. 9,
2251-2266, 1981. After a further tailing with dCTP,
this material was annealed by hybridisation to a
dGTP-tailed pBR322 plasmid at the PstI site. The
hybrid DNA was used to transform E.coli strain
MC 1061. A library of approximately 10,000
tetracycline-resistant colonies was obtained. Of
these, approximately 80% were found to be sensitive
to ampicillin, due to insertion of DNA into the
ampicillin-resistance gene at the PstI site.
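
The colony counts and percentages quoted for the two
libraries give a rough estimate of how many colonies carry
an insert. The sketch below assumes that loss of the second
antibiotic resistance is a fair proxy for an insert: the text
says so for the PstI site of the dC/dG tailed library, and
for the MboI library the sketch relies on the BamHI site of
pBR322 lying in the tetracycline-resistance gene. The
dictionary layout and function name are illustrative.

```python
LIBRARIES = {
    "MboI library": {"colonies": 7000, "insert_fraction": 0.85},
    "dC/dG tailed library": {"colonies": 10000, "insert_fraction": 0.80},
}


def estimated_recombinants(colonies, insert_fraction):
    """Approximate number of colonies with an insert, to the nearest colony."""
    return round(colonies * insert_fraction)


estimates = {
    name: estimated_recombinants(**library)
    for name, library in LIBRARIES.items()
}

for name, count in estimates.items():
    print(f"{name}: about {count} colonies with inserts")
```

Since the colony numbers in the text are themselves
approximate, these figures are order-of-magnitude estimates
only.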

E. Isolation of specific bovine factor IX clones
(i) From MboI library

The library of colonies, in an unordered fashion,
was transferred onto 13 Whatman 541 filter
papers and amplified with chloramphenicol, to
increase the number of copies of the plasmid in
the colonies, as described by Gergen et al., Nucl.
Acids Res. 7, 2115-2136 (1979). The filters
were pre-hybridised at 65°C for 4h in 6 x NET
(1 x NET = 0.15 M NaCl, 1 mM EDTA, 15 mM
Tris-HCl, pH 7.5), 5 x Denhardt's, 0.5% NP40
non-ionic surfactant and 1 microgram/ml yeast RNA
as described by Wallace et al., Nucl. Acids Res. 9,
879-894 (1981). Hybridisation was carried out
at 47°C for 20h in the same solution containing
3 x 10^5 cpm (0.7 nanogram/ml) of labelled oligo
N2 probe. Labelling was done by
phosphorylation of the oligonucleotides at the 5'
hydroxyl end using [gamma-32P]-ATP and T4
phosphokinase (Huddleston & Brownlee,
Nucl. Acids Res. 10, 1029-1038, 1981). At the
end of the hybridisation, filters were washed
successively at 0-4°C (2h), 25°C (10 min),
37°C (10 min) and 47°C (10 min). After
radioautography of the filters from this screening,
one colony showed a positive signal above
background. This colony was designated BIX-1
clone.
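
The choice of 47°C for hybridisation and for the final wash
can be put in context with a rough duplex stability estimate.
The sketch below uses the common rule of thumb for short
oligonucleotides, 2°C per A-T pair plus 4°C per G-C pair.
This rule is an assumption brought in for illustration and is
not stated in the specification. The 17-nucleotide target is
written in its DNA sense form, since a duplex and its
complement have the same base-pair composition.

```python
from itertools import product

# R = A or G, Y = C or T (DNA form of the degenerate target).
AMBIGUITY = {"R": "AG", "Y": "CT"}
TARGET = "GARTGYTGGTGYCARGC"


def rule_of_thumb_tm(sequence):
    """Estimate Tm in degrees C as 2 per A/T plus 4 per G/C."""
    at_pairs = sequence.count("A") + sequence.count("T")
    gc_pairs = sequence.count("G") + sequence.count("C")
    return 2 * at_pairs + 4 * gc_pairs


choices = [AMBIGUITY.get(base, base) for base in TARGET]
variants = ["".join(combination) for combination in product(*choices)]
estimates = [rule_of_thumb_tm(variant) for variant in variants]

print(min(estimates), max(estimates))
```

Under this rule the 16 possible duplexes fall between 50°C
and 58°C, so the 47°C used in the text sits just below the
lowest estimate. The rule is approximate and ignores salt and
mismatch effects.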

(ii) From dC/dG-tailed library

Screening of this library, in an ordered array
fashion, using oligo N2 probe as described
above resulted in the identification of a
positive clone. This was designated BIX-2 clone.

F. Sequence characterisation of bovine factor IX
cDNA clones

Characterisation of BIX-1 clone by restriction
endonuclease cleavage indicated that it contained
a DNA insert of about 430 base-pairs (data
omitted, for brevity). Figure 5 shows part of the
nucleotide sequence of the coding strand,
determined by the Maxam-Gilbert method,
extending over 304 nucleotides and provides
direct evidence that it has the identity of a bovine
factor IX sequence. Thus, nearly all of this 304
nucleotide sequence (corresponding to the amino
acid residues 52-139) agrees with the
nucleotide sequence predicted from the known
bovine factor IX amino acid sequence data
(Katayama et al., Proc. Natl. Acad. Sci. 76,
4990-4994, 1979). Over this region, there are
no discrepancies between BIX-1 and these
published data for factor IX, except at nucleotides
38-40 where the amino acid coded for is Asp
instead of Thr. This amino acid change was
similarly observed in a second, independent cDNA
clone (BIX-2; see below). The remainder of the
304-nucleotide sequence, i.e. that shown in
brackets in Figure 5, does not agree with the
published bovine factor IX amino acid data of
Katayama.

In Figure 5, the underlined portion denotes the
sequence corresponding to the oligo N2 probe
sequence, the asterisk denotes a nonsense codon,
the brackets enclose a sequence which does not
correspond to Katayama's amino acid data and
the arrows indicate HinfI restriction sites. The
Katayama numbering system for amino acids is
shown and this sequence is in the opposite
orientation to the direction of transcription of the
tetracycline-resistance gene of the plasmid.

By similar methods, BIX-2 clone was found to
have a DNA insert of 102 nucleotides and this
spans the nucleotide positions 7-108 as shown
in Figure 5. The nucleotide sequences for BIX-1
and BIX-2 clones over this region (nucleotides
7-108) were identical.

G. Isolation of human factor IX gene

(i) Initial clone: lambda HIX-1

A library of cloned human genomic DNA,
namely a HaeIII/AluI lambda phage Charon 4A
library prepared by Lawn et al., Cell 15,
1157-1174, 1978, was used. 10^6 phage
recombinants from this library were screened
using the in situ plaque hybridisation procedure as
described by T. Maniatis et al., Cell 15, 687,
1978. Pre-hybridisation and hybridisation were
carried out at 42°C in 50% formamide. After
hybridisation, filters were washed at room
temperature with 2 x SSC (1 x SSC = 0.15 M
NaCl, 15 mM sodium citrate, at pH 7.2) and 0.1%
SDS, then at 65°C with 1 x SSC and 0.1% SDS.

Two DNA molecules, being restriction
fragments from the factor IX cDNA cloned in
BIX-1, were radiolabelled and used as probes in
the hybridisation. The first fragment corresponds
to nucleotide numbers -8 to 317 on the
numbering system of Figure 5, and was isolated
by Sau3AI digestion of BIX-1 plasmid DNA. The
isolated DNA was labelled to high specific activity
by incorporation of [alpha-32P]-dATP using
nick translation (Rigby et al., J. Mol. Biol. 113,
237-251, 1977, modified, vide infra). Using this
probe, 10 clones were isolated. These were
plaque-purified and re-hybridised with a
247-nucleotide fragment from BIX-1 clone. This
fragment, derived from nucleotides 3-249, can
be seen in Figure 5. It contains only sequences
in agreement with the Katayama bovine factor IX
amino acid sequence and was isolated by HinfI
digestion of BIX-1 plasmid DNA. Only a single
clone gave a positive hybridisation signal with this
247-nucleotide probe. This clone was further
plaque-purified and the resulting clone was
designated "lambda HIX-1".

(ii) Subsequent genomic clones

A sub-clone, pATIXcVII, of recombinant human
factor IX cDNA from human liver mRNA,
prepared as described in Section L below, was
linearised by digestion with HindIII and BamHI.
The resulting 2 kb cDNA molecule was purified by
1% agarose gel electrophoresis. After
electroelution, about 100 ng of this cDNA was
nick-translated with [alpha-32P]-dATP (see above)
and used as a hybridisation probe to screen the
HaeIII/AluI lambda phage Charon 4A human
genomic DNA library for further genomic clones,
using standard stringent hybridisation conditions.
Two further human factor IX genomic clones,
designated lambda HIX-2 and lambda HIX-3,
were thus obtained.

H. Characterisation of human factor IX genomic
clones

(i) Restriction map

The initial lambda HIX-1 clone was
characterised by cleavage with various single and
double digests with different restriction
endonucleases and Southern blotting of fragments
using the bovine factor IX cDNA probe (results
omitted for brevity). The subsequently isolated
lambda HIX-2 and 3 clones were characterised
in the same way except that the human cDNA
probe, pATIXcVII (see Section L below), was used
for the Southern blots. From these results it
emerged that the sequences in the factor IX
genome corresponding to lambda HIX-2 and 3
overlapped with lambda HIX-1 as shown in
Figure 6(e). In Section (d) of this Figure 6 are
summarised the results of the analysis using the
restriction enzymes EcoRI (E), HindIII (H), BglII (B),
BamHI (Ba) and PvuII (P), and this serves as a
restriction enzyme map.
(ii) Sequencing

Numerous sub-clones were isolated from a
knowledge of the restriction enzyme map as
described in Section J(ii) below, the majority in a
vector pAT153/PvuII/8. Examples of these
sub-clones are shown in Figure 6(c) and a number,
being of a convenient length, were used for
sequence analysis by the Maxam-Gilbert method
(Maxam & Gilbert, Proc. Natl. Acad. Sci. USA 74,
560-564, 1980).

Initially sequencing was done on part of a 1.4
kb EcoRI restriction fragment from the sub-clone
pHIX-17, see below and J(i). A 403-nucleotide
(base-pair) length was sequenced, of which a
129-nucleotide length was identified as lying
within an exon region. This is the 129-nucleotide
sequence used above to define the factor IX DNA.

Subsequently, a region of 11873 bases was
sequenced in the central portion of the gene [see
Figure 6(b)]. Figure 7 shows the sequence of one
strand of the DNA. The nucleotides are arbitrarily
numbered from 1 to 11873 in the 5' to 3'
direction. The original 403-nucleotide sequence
runs from Figure 7 nucleotides Nos. 4372 to 4774
and is indicated by O-O'. The 129-nucleotide
sequence lying within the 403 one runs from
Figure 7 nucleotides Nos. 4442 to 4570 and is
indicated by J-J'. This corresponds exactly to the
"w" exon.

In detail, the sequence of nucleotides Nos.
1-7830 contains two short exons (nucleotides
4442-4570 and 7140-7342 respectively)
marked w and x in Figure 6(a), J-J' and J'-J" in
Figures 7 and 9. These code for amino acids
85-127 and 128-195 respectively of the
amino acid sequence predicted from the human
factor IX cDNA clone (Figure 9). There are no
differences in amino acid sequences predicted
from the genomic and cDNA clones of the
invention in these two exon regions. The sequence
of the gene between residues 7831-11873 is
less complete, containing several gaps, but is still
a useful characterisation of the gene as it contains
two "AluI repeat" sequences, nucleotides
7960-8155 and 9671-9938. AluI sequences
are found in many genes. The repetition is not
exact but there is a typical degree of homology
between them. This further characterisation
provides a useful cross-check on the accuracy of
the restriction enzyme map. This emerges more
clearly from the restriction enzyme chart of Figure
8.
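
The nucleotide coordinates quoted above can be checked for
internal consistency with inclusive, 1-based interval
arithmetic. The sketch below is illustrative only; the
function and dictionary names are not taken from the
specification, and the comparison of exon w with residues
85-127 assumes the exon is read as whole codons.

```python
def span_length(start, end):
    """Length of an inclusive, 1-based interval."""
    return end - start + 1


def lies_within(inner, outer):
    """True if the inner interval is contained in the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


REGIONS = {
    "403-nucleotide region": (4372, 4774),
    "exon w": (4442, 4570),
    "exon x": (7140, 7342),
    "first AluI repeat": (7960, 8155),
    "second AluI repeat": (9671, 9938),
}

lengths = {name: span_length(*coords) for name, coords in REGIONS.items()}
for name, length in lengths.items():
    print(f"{name}: {length} nucleotides")

# Exon w is assigned to amino acids 85-127 in the text.
codons_in_exon_w = lengths["exon w"] // 3
print(codons_in_exon_w, span_length(85, 127))
print(lies_within(REGIONS["exon w"], REGIONS["403-nucleotide region"]))
```

The 403- and 129-nucleotide figures follow directly from the
coordinates, and 129 nucleotides correspond to the 43
residues from 85 to 127.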

Figure 8 is a chart produced by a computer
analysis of the sequence data of the 11873
nucleotide long sequence of Figure 7. Column 1 of
Figure 8 gives the arbitrary nucleotide number
allotted to the nucleotide of Figure 7. Column 2
apportions the nucleotide number as a fraction of
the whole sequence. Column 3 shows the
restriction enzymes which will cut the DNA within
various short sequences of nucleotides shown in
Column 4. The short sequences of Column 4 begin
with the nucleotide numbered in Column 1. With
the aid of this chart the positions of the restriction
sites shown in Figure 6(d) and some of the
sequences shown in Figure 6(c) can be
determined very accurately. For example
sequences II-IV are produced by restriction at
the following sites (denoted by the first nucleotide
number at the 5' end of each site).

II     3624-4769

III    6380-7378

IV     10589-11868

Particularly important sites are arrowed in Figure
8. Some of the relevant nucleotide numbers are
shown in Figure 6(c), the number given being that
of the nucleotide at the 5' end of each site.
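
A chart of the kind described for Figure 8 can be produced by
scanning a sequence for recognition sites. The sketch below
is not the program used for Figure 8, which the text does not
describe. It uses the well-known recognition sequences of the
five enzymes named in the restriction map and a made-up demo
sequence, and reports the 1-based position of the first
nucleotide of each site together with its fractional position.

```python
# Recognition sequences (5' to 3') of the enzymes named in the map.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BglII": "AGATCT",
    "BamHI": "GGATCC",
    "PvuII": "CAGCTG",
}


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return sorted (position, enzyme, site) tuples, positions 1-based."""
    sequence = sequence.upper()
    hits = []
    for enzyme, site in sites.items():
        index = sequence.find(site)
        while index != -1:
            hits.append((index + 1, enzyme, site))
            index = sequence.find(site, index + 1)
    return sorted(hits)


# Hypothetical demo sequence, not taken from Figure 7.
demo = "TTGAATTCAAGCTTGGCAGCTGAAGGATCCTT"
hits = find_sites(demo)
for position, enzyme, site in hits:
    fraction = position / len(demo)
    print(f"{position:5d}  {fraction:.3f}  {enzyme:8s} {site}")
```

All five recognition sequences are palindromic, so scanning
one strand is enough to find every site.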
Further sequence analysis of the sub-clones V,
VI, VII and VIII shown in Figure 6(c) indicates that
the factor IX gene is divided into at least 7 exon
regions separated by at least 6 introns. The
positions of the exons are shown in Figure 6(a) by
the solid blocks labelled t, u, v, w, x, y and z. The
"z" exon is much the longest and its 3'-end
coincides with the 3'-end of the mRNA. The
location of these exons relative to the cDNA
sequence is discussed below (Section L) and it is
clear that the "t" exon shown in Figure 6(a) is not
a marker for the 5'-end of the gene, as its
sequence fails to match that of the extreme 5'-end
of the cDNA clone (see below). This suggests that
the factor IX gene will be longer at its 5'-end than
the 27 kb region shown in Figure 6, and will
contain at least one further exon.

Additionally, pHIX-17 DNA was digested with
EcoRI. The digested material was resolved on
0.8% agarose gel and a 1.4 kb fragment was
isolated in solution by electroelution. It can be
stored in the usual manner. This 1.4 kb long
molecule was used for the initial sequencing. Only
about 1.0 kb is inserted DNA, the remaining 0.4 kb
being of pBR322. A 403 nucleotide length of the
inserted DNA was sequenced and is identified as
O-O' in Figure 7. The same 1.4 kb fragment was
also labelled and used as a probe in Section M.

I. Construction of a vector pAT153/PvuII/8

A derivative of the plasmid pAT153 (Twigg &
Sherratt, Nature 283, 216-218, 1980) was
prepared for subcloning of PvuII fragments of
factor IX genomic clones, and for ease of
characterisation of the resultant subclones. Two
partially complementary synthetic
deoxyoligonucleotides, oligo N3 and oligo N4,
were synthesised by the solid phase
phosphotriester method described in Section C
above. Each has "overhanging" BamHI and HindIII
recognition sequences and an internal PvuII
recognition sequence. Figure 10 shows the
structures of oligo N3 and oligo N4. BamHI and
HindIII cleave ds DNA to leave sticky or
"overhanging" ends. For example HindIII cleaves

-AAGCTT-

-TTCGAA-

between the adenine-carrying nucleotides of each
strand leaving the sticky-ended complementary
strands:

-A

-TTCGA

which are present in the oligo N3/N4 combination.
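
The staggered cut described above can be sketched in a few
lines. The example assumes the standard HindIII cleavage
position (between the two adenines on each strand) and a
made-up ten-nucleotide duplex. The lower strand is kept in
3'-to-5' order so that it lines up under the upper strand, as
in the diagram. Only the first site in the sequence is cut,
and the function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
HINDIII_SITE = "AAGCTT"
CUT_AFTER = 1  # HindIII cuts after the first A of AAGCTT


def hindiii_digest(top_strand):
    """Cut a duplex at its first HindIII site.

    Returns ((top_left, bottom_left), (top_right, bottom_right)),
    with bottom strands written 3' to 5' under the top strand,
    or None if the site is absent.
    """
    top = top_strand.upper()
    site_start = top.find(HINDIII_SITE)
    if site_start == -1:
        return None
    bottom = top.translate(COMPLEMENT)
    top_cut = site_start + CUT_AFTER
    bottom_cut = site_start + len(HINDIII_SITE) - CUT_AFTER
    left = (top[:top_cut], bottom[:bottom_cut])
    right = (top[top_cut:], bottom[bottom_cut:])
    return left, right


left, right = hindiii_digest("GGAAGCTTCC")  # hypothetical duplex
print(left)   # upper strand ends in A, lower strand ends in TTCGA
print(right)  # upper strand starts with the AGCT overhang
```

The left-hand fragment reproduces the -A over -TTCGA
arrangement shown in the text, leaving four unpaired
nucleotides on each end.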
pAT153 was digested with HindIII and BamHI
and the 3393 nucleotide long linear fragment was
separated from the 346 nucleotide shorter
fragment by 0.7% agarose gel electrophoresis,
followed by electroelution of the appropriate
bands visualised by ethidium bromide
fluorescence under UV light. After treatment with
calf intestinal phosphatase, as described in
Section D(i), the BamHI-HindIII 3393-long
fragment was ligated to an equimolar mixture of
oligo N3 and oligo N4 which themselves had been
pretreated, as a mixture, with T4 polynucleotide
kinase and ATP, to phosphorylate their respective
5'-terminal OH groups. After transforming
competent MC 1061 cells (see above) and plating
on L-broth plates containing 20 micrograms/ml
final concentration of ampicillin, 11 colonies were
selected for further analysis. 1 ml plasmid
preparations, see Holmes and Quigley, Analytical
Biochem. 114, 193-197 (1981), were isolated
from the 11 colonies. The plasmid DNA was then
analysed for its ability to be linearised by the
restriction enzymes BamHI, HindIII and PvuII. Four
clones were positive in this assay and one,
labelled pAT153/PvuII/8, was selected for
sequence analysis by the Maxam-Gilbert method
across the newly constructed section of the
plasmid. This part of the sequence is shown in
Figure 11 along with the unique restriction sites.
The novel part of the plasmid sequence is
underlined: the remainder is present in the parent
plasmid pAT153. The vector allows blunt-end
cloning (after treatment with phosphatase) into the
inserted PvuII site. The cloned DNA can be
excised, assuming that it lacks appropriate internal
restriction sites, with BamHI/HindIII, BamHI/ClaI or
BamHI/EcoRI double digests. The sites adjacent to
the PvuII site are also convenient for end labelling
with 32P for characterization of the ends of cloned
DNA by the Maxam-Gilbert sequencing method.

J. Sub-cloning of human factor IX gene

The following subcloning experiments were
carried out as a first step towards sequencing of
the factor IX gene, and to facilitate the isolation of
a small DNA fragment to be used as a probe for
the analysis of genomic DNA from haemophilia B
patients (see Section M).

(i) Sub-cloning into pBR322 plasmid

An approximately 11 kilobase BglII fragment
(see Figure 6) within the factor IX DNA insert in
lambda HIX-1 clone was inserted into the BamHI
site of pBR322. Transformation was carried out in
the E.coli strain HB 101. The resulting
"sub-clone" was designated pHIX-17 (Figure 12).
(ii) Sub-cloning into pAT153/PvuII/8
(a) Plasmid DNA from pHIX-17 was prepared
and cleaved with PvuII. Five discrete fragments, all
derived from the DNA insert of pHIX-17, were
isolated. The sizes of these fragments were
approximately 2.3, 1.3, 1.2, 1.1 and 1.0 kilobases.
These fragments were blunt-end ligated into the
PvuII site of the pAT153/PvuII/8 vector and
transformed into E.coli HB 101. Five clones of
recombinant DNA which carried the 2.3, 1.3, 1.2,
1.1 and 1.0 kb fragments were obtained and these
were designated pATIXPvu-1, 2, 3, 4 and 5
respectively. Factor IX DNA from pATIXPvu-2 is
abbreviated as IV and pATIXPvu-5 as III in Figure
6(c).

(b) Phage DNA from the lambda HIX-1 genomic
clone was digested with EcoRI. Three different
fragments (approximately 5, 2.3 and 0.96 kb; see
Figure 6), all derived from the insert into the
phage, were isolated and inserted in
pAT153/PvuII/8 vector at the EcoRI site and
cloned in E.coli HB 101 to form sub-clones. The
three resulting clones for each of these fragments
were designated pATIXEco-1, 2 and 4 respectively
which are shown in the restriction map of Figure
6(d). pATIXEco-1 was further digested with both
EcoRI and BglII, and the "overhanging ends" of the
restriction sites filled in with deoxynucleotide
triphosphates using the Klenow fragment of DNA
polymerase I. After isolation of the resulting 1.1 kb
fragment by agarose gel electrophoresis and
electroelution, it was blunt-end ligated using T4
DNA ligase into the PvuII site of pAT153/PvuII and
allowed to transform E.coli MC 1061. The
resultant sub-clone was designated pATIXBE and
the factor IX DNA sequence thereof is abbreviated
as II in Figure 6(c).

(c) Phage DNA from lambda HIX-2 was
digested with HindIII and EcoRI giving a 1.8 kb
and a 2.6 kb fragment amongst others. These
fragments were eluted separately, filled in as
described in (b) above, cloned as above into the
PvuII site of pAT153/PvuII/8 and allowed to
transform E.coli MC 1061. The resultant clones
were designated pATIXHE-1, the factor IX
DNA sequence of which is abbreviated as V in
Figure 6(c), and pATIXEco-6, the factor IX
DNA sequence of which is abbreviated as VI in
Figure 6(c).

(d) Phage DNA from lambda HIX-3 was
digested with EcoRI and HindIII and the fragments
of 2.3 kb and 2.7 kb were sub-cloned exactly as
described in (c) above. The resultant clones were
designated pATIXEH-1, abbreviation VII in Figure
6(c), and pATIXHE-2, abbreviation VIII in Figure
6(c).

K. Preparation of a library of cDNA clones from
human liver mRNA

Messenger RNA was extracted from a human
liver and a 20-22 Svedberg unit enriched fraction
of mRNA prepared exactly as described for bovine
mRNA in Section B above, except that a
'translation assay' was not used. The first steps in
the construction of the double-stranded DNA were
carried out using the 'Stanford protocol' kindly
supplied from Professor P Berg's department at
Stanford University, USA. This itself is a
modification of Wickens, Buell & Schimke
(J. Biol. Chem. 253, 2483-2495, 1978) and
some further modifications, incorporated in the
description given below, were made in the present
work.

For the first strand cDNA synthesis 6
micrograms of poly(A)+ 20-22S human mRNA
was incubated with 5 microlitres of 10x buffer
(0.5 M Tris-chloride, pH 8.5 at room temperature,
0.4 M KCl, 0.008 M MgCl2 and 4 mM
dithiothreitol), 20 microlitres of a 2.5 mM mixture
of each of the four deoxynucleoside triphosphates,
0.5 microlitres of oligo dT(12-18), 1 microlitre
(containing 0.5 microcurie) of [alpha-32P]-dATP, 2
microlitres of reverse transcriptase (14 units per
microlitre) and the volume made up to 50
microlitres with deionized water. After incubation
for 1 hour at 42°C, the solution was boiled for 1½
minutes and then rapidly cooled on ice. The
second strand synthesis was carried out by adding
directly to the above solution 20 microlitres of 5x
second strand buffer (250 mM Hepes/KOH pH 6.9,
250 mM KCl, 50 mM MgCl2), 4 microlitres of a
2.5 mM mixture of each of the four
deoxynucleoside triphosphates, 10 microlitres of
E.coli DNA polymerase I (6 units per microlitre)
and making the volume of the solution up to 100
microlitres with deionized water. After incubation
for 5 hours at 15°C, S1 nuclease digestion was
carried out by addition of 400 microlitres of S1
nuclease buffer (0.03 M sodium acetate pH 4.4,
0.25 M NaCl, 1 mM ZnSO4) and 1 microlitre of
S1 nuclease (at 500 units per microlitre). After
incubating for 30 minutes at 37°C, 10 microlitres
of 0.5 M EDTA (pH 8.0) was added. Double
stranded DNA was deproteinised by shaking with
an equal volume of a phenol:chloroform (1:1)
mixture, followed by ether extraction of the
aqueous phase and precipitation of ds DNA by
addition of 2 volumes of ethanol. After 16 hours at
-20°C, ds DNA was recovered by centrifugation.
DNA polymerase I "fill in" of S1 ends was carried
out by a further incubation of the sample dissolved
in 25 microlitres of 50 mM tris-chloride, pH 7.5,
10 mM MgCl2, 5 mM dithiothreitol and containing
0.02 mM dNTP and 6 units of DNA polymerase I.
After incubating for 10 minutes at room
temperature, 5 microlitres of EDTA (0.1 M at pH
7.4) and 3 microlitres of 5% sodium dodecyl
sulphate (SDS) were added.
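
The stock volumes quoted for the two synthesis reactions
imply the working concentrations below. This is a worked
dilution calculation only, using figures as printed above.
For the second strand it counts only what the second strand
buffer itself contributes and ignores carry-over from the
first strand mixture. The helper name final_concentration is
illustrative.

```python
def final_concentration(stock, stock_volume_ul, total_volume_ul):
    """Concentration after diluting a stock into the total volume."""
    return stock * stock_volume_ul / total_volume_ul


# First strand: 5 microlitres of 10x buffer and 20 microlitres of
# 2.5 mM dNTP mixture in a 50 microlitre reaction.
first_strand_mm = {
    "Tris-chloride": final_concentration(500, 5, 50),
    "KCl": final_concentration(400, 5, 50),
    "each dNTP": final_concentration(2.5, 20, 50),
}
reverse_transcriptase_units = 2 * 14

# Second strand: 20 microlitres of 5x buffer in 100 microlitres.
second_strand_mm = {
    "Hepes/KOH": final_concentration(250, 20, 100),
    "MgCl2 from buffer": final_concentration(50, 20, 100),
}
polymerase_units = 10 * 6

print(first_strand_mm, reverse_transcriptase_units)
print(second_strand_mm, polymerase_units)
```

The first strand reaction therefore runs at 1x buffer with
1 mM of each deoxynucleoside triphosphate and 28 units of
reverse transcriptase, and the second strand buffer is also
diluted to 1x, with 60 units of DNA polymerase I.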

The following part of the protocol differs from
the 'Stanford protocol'. The sample was
fractionated on a "mini"-Sephacryl S400 column
run in a disposable 1 ml pipette in 0.2 M NaCl, 10
mM tris-chloride, pH 7.5 and 1 mM EDTA. The
first 70% of the "break-through" peak of
radioactivity was pooled (0.4 ml) and
deproteinised by shaking with an equal volume of
n-butanol:chloroform (1:4). To the aqueous phase
was added 1 microgram of yeast RNA (BDH) as
carrier followed by 2 volumes of ethanol. After 16
hours at -20°C double stranded DNA was
recovered by centrifugation for blunt-end ligation
into calf intestinal phosphatase-treated PvuII-cut
pAT153/PvuII/8, using T4 DNA ligase (see I and
J(ii) above). After performing a trial experiment, it
was found that when the bulk of the sample was
incubated with 200 nanograms of vector DNA in a
suitable buffer (1 mM ATP, 50 mM Tris-chloride,
pH 7.4, 10 mM MgCl2 and 12 mM dithiothreitol)
and using 10 microlitres of T4 DNA ligase in a
total volume of 0.2 ml, then on subsequent
transformation of competent E.coli MC 1061 cells
a total of 58,000 ampicillin-resistant colonies
were obtained. Up to 20% of these were
estimated to derive from "background"
non-recombinants derived by religation of the vector
itself. This 20-22S cDNA library was amplified
by growing the E.coli for a further 6 hours at 37°C.
1 ml aliquots of this amplified library were stored at
-20°C in L broth containing 15% glycerol, before
screening for factor IX cDNA clones.

L. Isolation and sequence analysis of human factor
IX cDNA clones

6000 colonies of the amplified 20-22S
human cDNA library were plated on each of ten
15 cm agar plates and after growing overnight were
blotted onto Whatman 541 filter paper. After
preparing filters for hybridisation as described in
section E(i) above, the immobilised colonies were
probed with a 1.1 kb molecule of [alpha-32P]-nick
translated human factor IX genomic DNA isolated
from the pATIXBE subclone (Section J, above).
This linear 1.1 kb section of factor IX genomic
DNA was isolated from pATIXBE by cleavage
with the restriction enzymes BamHI and HindIII,
followed by separation of the 1.1 kb section from
the vector by 1.5% agarose gel electrophoresis. After
electroelution, nick-translation was carried out as
before and the material used in a hybridisation
reaction for 16 hours at 65°C in 3x SSC, 10x
Denhardt's solution, 0.1% SDS and 50
micrograms/ml sonicated denatured E.coli DNA
and 100 micrograms/ml of sonicated denatured
herring sperm DNA. After hybridisation filters were
washed at 65°C successively in 3x SSC, 0.1%
SDS (2 changes, half an hour each) and 2x SSC,
0.1% SDS (2 changes, half an hour each). After
radioautography, 7 clones were selected as
positive, but on dilution followed by re-screening
by hybridisation as above, only 5 proved to be
positive. Plasmid DNA was isolated from each of
these 5 clones and one, designated pATIXcVII,
was selected for sequence analysis as it appeared
to be the longest of the 5 clones as judged by its
electrophoretic mobility on 1% agarose gel
electrophoresis. A second shorter clone,
designated pATIXcVI, was also selected for partial
sequence analysis.

80 Sequencing was carried out by the Maxam- 
Gilbert method and a 2778 nucleotide long 
section of sequence is shown in Figure 9. 
Nucleotides 1 1 5 — 2002 were derived by 
sequencing clone pATIXcVII. (The actual extent of 

85 this clone is greater as it extends in a 5' direction 
to nucleotide 1 7. The sequence between 1 7 and 
1 1 1 is inverted with respect to the remainder of 
the sequence presumably due to a cloning 
artefact.) Nucleotides 1 — 130 were derived from 

90 clone pATIXcVI which extends from nucleotides 
1 — 1 548 of Figure 9. The sequence from Nos. 
2002 — 2778 was derived by isolating 4 
additional clones designated pATIX108.1, 
pATIXI 08.2, pATIXI 08.3 and pATlXDB. The first 

95 3 were derived from a mini-library (designated 
GGB1 08) of the cDNA clones constructed exactly 
as described in section K above except that 
sucrose density gradient centrifugation was used 
instead of chromatography on "Sephacryl" 
1 00 S — 400 to fractionate the double-stranded DNA 
according to size. A fraction of m.w. from 1 kb — 5 
kb was selected and an amplified library of 1 0,000 
independent clones containing approximately 20% 
background non-recombinant clones was 
1 05 obtained. Clone pATIXDB derived from another 
cDNA library (designated DB1) constructed as 
described in section K except that total poly A+ . 
human liver mRNA was used as the starting 
material and sucrose density gradient 
f i o centrifugation was used to fractionate the DNA 
according to size as in the construction of the 
mini-library GGB 1 08. The complexity of this 
library was 95,000 with an estimated background 
of non-recombinants of 50%. Clones pATIXI 08.1 
115 and pATIXI 08.2 were selected from a group of 30 
hybridization-positive clones isolated by 
Grunstein-Hogness screening of the mini library 
GGB 1 08 using a 32 P-nick translated probe derived 
from a Sau3A\ restriction enzyme fragment, itself 
1 20 derived from nucleotides 1 796 — 2002 of clone 
pATIXcVII. From pATIXI 08.1 the sequence of 
nucleotides 2009 — 2756 was determined (Figure 
9). Following this the sequence of a part of 
pATIXI 08.2, specifically nucleotides 
1 25 1 950 — 2086, provided the overlap with 
pATIXcVII. The remaining 28 hybridization 
positive clones were screened by carrying out a 
triple enzymatic digestion with the restriction 
enzymes fcoRI, BamH\ and Hind[\\ and screening 
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the product of the digest for an FcoRI restriction 
fragment extending in the 3' direction from the cut
at position 2480. By this approach, clone
pATIX108.3 was selected and sequenced from
nucleotides 2642-2778. This clone was
followed by three A nucleotides, which sequence
was confirmed as a vestigial marker for the poly A
tail, by the subsequent isolation of clone pATIXDB
by a similar method. pATIXDB was sequenced
from Nos. 2760-2778 and ended in 42 A
nucleotides, thus marking the 3' end of the mRNA.
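
The way the four clones tile the 3' region can be checked with a short sketch. It assumes only the nucleotide ranges quoted above; the function name merge_spans and the dictionary layout are illustrative and not part of the specification.

```python
# Sequenced spans (Figure 9 nucleotide numbering) as quoted in the text.
clone_spans = {
    "pATIX108.2": (1950, 2086),
    "pATIX108.1": (2009, 2756),
    "pATIX108.3": (2642, 2778),
    "pATIXDB": (2760, 2778),
}


def merge_spans(spans):
    """Merge inclusive (start, end) spans that overlap or abut."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous contig: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


contigs = merge_spans(clone_spans.values())
print(contigs)  # a single contig means no gap between the clones
```

A single merged span starting at or before nucleotide 2002 and ending at 2778 is consistent with the statement that these clones extend the pATIXcVII sequence to the 3' end of the mRNA.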

Figure 9 shows that the predicted amino acid
sequence codes for a protein of 456 amino
acids, but included in this are 41 residues of
precursor amino acid sequence preceding the N-terminal
tyrosine residue (Y) of the definitive
length factor IX protein. The precursor section of
the protein shows a basic amino acid domain
(amino acids -1 to -4) as well as the more usual
hydrophobic signal peptide domain (amino acids
-21 to -36).

The definitive factor IX protein consists of 415
amino acids with 12 potential gamma-carboxyglutamic
acid residues at amino acids 7, 8,
15, 17, 20, 21, 26, 27, 30, 33, 36 and 40. Two
potential carbohydrate attachment sites occur at
amino acid residues 157 and 167. The activation
peptide encompasses residues 146-180, which
are cut out in the activation of Factor IX (see
Background of Invention) by the peptide cleavage
of an R-A and an R-V bond. This leaves a light
chain spanning residues 1-145 and a heavy
chain spanning residues 181-415.
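
The residue counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the ranges given in the text; the variable names are illustrative.

```python
# Residue ranges of the definitive factor IX protein, as quoted in the text.
PRECURSOR_RESIDUES = 41
regions = {
    "light chain": (1, 145),
    "activation peptide": (146, 180),
    "heavy chain": (181, 415),
}

# Inclusive ranges, so length = end - start + 1.
lengths = {name: end - start + 1 for name, (start, end) in regions.items()}
mature_length = sum(lengths.values())
total_length = PRECURSOR_RESIDUES + mature_length

for name, length in lengths.items():
    print(f"{name}: {length} residues")
print(f"definitive protein: {mature_length} residues")
print(f"with precursor sequence: {total_length} residues")
```

The three regions account for all 415 residues of the definitive protein, and adding the 41 precursor residues gives the 456 amino acids predicted from Figure 9.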

The exact location of the boundaries between
exons (see Section H, above) and how they are
joined in the mRNA is marked in Figure 9. The
exons are marked t, u, v, w, x, y, z. It can be seen
that there is a rough agreement between the exon
domains and the protein regions. For example, the
exon for the signal peptide is distinct from that of
the GLA region. Also that of the activation peptide
is separated from the serine protease domain.

The 3' non-coding region of the mRNA is
extensive, consisting of 1390 residues (including
the UAAUGA double terminator 1389-1394 but
excluding the poly A tail).

The factor IX cDNA is cleavable by the
restriction enzyme HaeIII to give a fragment from
nucleotides 133-1440, i.e. a 1307 nucleotide
long region of DNA entirely encompassing the
definitive factor IX protein sequence. The
nucleotide sequence recognised by HaeIII is
GGCC. This fragment should be a suitable starting
material for the expression of factor IX protein
from suitable promoters in bacterial, yeast or
mammalian cells. Another suitable fragment could
be derived using the unique StuI site at residue 41
(corresponding to an early part of the hydrophobic
signal peptide region) and linking it to a suitable
promoter. The nucleotide sequence recognised by
StuI is AGGCCT.
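
Locating such recognition sites in a sequence is a plain substring search. The sketch below is illustrative only: the function find_sites and the toy sequence are assumptions made for the example, and the toy sequence is not taken from the factor IX cDNA.

```python
def find_sites(sequence, site):
    """Return the 1-based start position of every occurrence of `site`."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(site, start + 1)
    return positions


HAEIII = "GGCC"    # recognition sequence quoted in the text
STUI = "AGGCCT"    # recognition sequence quoted in the text

toy = "TTAGGCCTAAGGCCAT"   # made-up sequence for illustration only
haeiii_sites = find_sites(toy, HAEIII)
stui_sites = find_sites(toy, STUI)
print("HaeIII:", haeiii_sites)
print("StuI:", stui_sites)

# Length of the HaeIII fragment as the text computes it (1440 minus 133).
fragment_length = 1440 - 133
print("HaeIII fragment:", fragment_length, "nucleotides")
```

Because AGGCCT contains GGCC, every StuI site is also a HaeIII site, which is worth keeping in mind when choosing between the two fragments.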



M. Southern Analysis of normal and patient 
Christmas disease DNA 

(i) Normal 

The standard (Southern) blotting procedure
(Southern, J. Mol. Biol. 98, 503-517, 1975) was
used. In a typical experiment, 10-20 micrograms
of human genomic DNA (prepared from
uncultured blood cells or cultured lymphocytic
cells) were digested with one of a number of
restriction endonucleases and loaded onto a single
gel slot. Following electrophoresis on 0.8%
agarose gel and transfer onto nitrocellulose, it was
hybridised with a 32P-labelled probe, either probe II
or the 1.4 kb EcoRI fragment (see Section H).
Labelling of the probe was carried out by nick
translation using the method of Rigby et al., supra,
modified as follows. About 100 nanograms of the
probe was mixed with 40 microcuries of
[alpha-32P] dATP (activity about 3000 Curies/mMole,
obtained from Amersham International PLC) in
0.05M Tris-HCl, pH 7.5, 0.01M MgCl2, 0.001M
dithiothreitol and dCTP, dGTP, dTTP each at a
final concentration of 20 micromolar in a volume
of 29 microlitres. To this was added 1 microlitre of
"solution X" made up of a mixture of 6 units of
DNA polymerase I (E. coli), 0.6 nanograms of
pancreatic DNase I (Worthington), 1 microgram of
crystalline BSA in 10 microlitres of 50% v/v
glycerol containing 0.05M Tris-HCl, pH 7.5,
0.01M MgCl2 and 0.001M dithiothreitol. The
mixture was incubated for 2 hours at 15°C, after
which high molecular weight DNA was purified by
chromatography on G-100 "Sephadex". Figure
13 shows the major bands obtained with DNA
from normal individuals probed with either probe II
(Figure 6) or labelled 1.4 kb EcoRI fragment. With
each of the 4 enzymes used, EcoRI, HindIII, BglII
and BclI, a single major band of about 4.8, 5.2, 11
and 1.7 kb, respectively, was obtained.
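
The amounts in the nick translation mixture described above can be worked through as follows. This is a sketch under two assumptions not stated outright in the text: that the listed quantities of "solution X" refer to a 10 microlitre stock from which 1 microlitre is taken, and that the 20 micromolar figure applies to the 29 microlitre mixture.

```python
# Quantities quoted in the text.
label_uCi = 40.0             # [alpha-32P] dATP added, microcuries
specific_activity = 3000.0   # Curies per millimole (same as uCi per nmol)
cold_dntp_uM = 20.0          # dCTP, dGTP and dTTP, each
mix_volume_ul = 29.0
solution_x_added_ul = 1.0
solution_x_stock_ul = 10.0   # assumption: listed amounts are per 10 ul stock

# 1 Ci/mmol equals 1 uCi/nmol, so uCi / (Ci/mmol) gives nmol.
label_pmol = label_uCi / specific_activity * 1000.0

# Micromolar multiplied by microlitres gives picomoles.
cold_dntp_pmol = cold_dntp_uM * mix_volume_ul

# Share of the solution X stock that ends up in the reaction.
fraction = solution_x_added_ul / solution_x_stock_ul
polymerase_units = 6.0 * fraction
dnase_ng = 0.6 * fraction
bsa_ug = 1.0 * fraction
final_volume_ul = mix_volume_ul + solution_x_added_ul

print(f"labelled dATP: {label_pmol:.1f} pmol")
print(f"each unlabelled dNTP: {cold_dntp_pmol:.0f} pmol")
print(f"DNA polymerase I: {polymerase_units:.2f} units")
print(f"DNase I: {dnase_ng:.2f} ng, BSA: {bsa_ug:.2f} micrograms")
print(f"final volume: {final_volume_ul:.0f} microlitres")
```

On these assumptions the labelled dATP is present in a much smaller molar amount than each unlabelled nucleotide.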

The fact that these restriction fragments had
the same length as those observed in the
restriction map of clone lambda HIX-1 confirmed
that the conditions of Southern blotting were
precise enough to detect the factor IX gene in total
DNA preparations. This provides the basis for
analysis of DNA from the blood of patients with
Christmas disease.

(ii) Christmas patients with gene deletions 

The value of the probes of the invention for the
assay of alterations of genes of some patients
suffering from Christmas disease has been
demonstrated as follows. Two patients with severe
Christmas disease, who also developed antibodies
to factor IX, were selected for study. The DNA
from 50 ml of blood was digested separately with
EcoRI, HindIII, BglII and BclI and Southern blots
prepared for probing with 32P-nick translated
probe II (Figure 6). No specific bands were
observed with either patient under conditions
where a control digest gave the pattern shown in
Figure 13. Similarly no bands were observed in
the patients when probe I, III or IV (Figure 6) was
substituted for probe II. In order to control for
the possibility that some experimental artefact
gave the observed 'negative' signal, a factor IX
gene probe (this time pATIXcVII, the cDNA
probe) was mixed with an irrelevant autosomal
gene probe which was specific for the human AI
apolipoprotein (Shoulders and Baralle, Nucl. Acids
Res. 10, 4873-4882, 1982). This experiment
showed that patient 1 had a normal AI
apolipoprotein gene, characterised by a 12 kb
band in an EcoRI digest, and confirmed that he
lacked the 5.5 kb band observed with pATIXcVII
and characteristic of the normal factor IX gene. It
was concluded that both patients have a sequence
of at least 18 kb deleted from their factor IX gene.
Two other patients, designated patients 3 and 4,
who had also developed antibodies to factor IX,
gave bands in the normal or abnormal positions on
Southern blots with some factor IX gene probes of
the invention, but not with others. This suggested
that these patients had less extensive deletions of
the gene, possibly about 9 kb in length.
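
The reasoning applied to these four patients can be summarised as a small decision rule. The sketch below is illustrative only: the function classify, the three band states and the example pattern for a patient with a partial deletion are hypothetical and are not data from the specification.

```python
def classify(bands_by_probe):
    """Interpret Southern blot results.

    bands_by_probe maps a probe name to 'normal', 'abnormal' or 'absent',
    describing the band seen with that probe.
    """
    states = set(bands_by_probe.values())
    if states == {"normal"}:
        return "no deletion detected"
    if states == {"absent"}:
        return "extensive deletion"
    return "partial deletion or rearrangement"


# Patients 1 and 2 in the text: no bands with probes I, II, III or IV.
patient_1 = {"I": "absent", "II": "absent", "III": "absent", "IV": "absent"}

# Hypothetical pattern of the kind described for patients 3 and 4.
patient_3 = {"I": "normal", "II": "abnormal", "III": "absent", "IV": "absent"}

control = {"I": "normal", "II": "normal", "III": "normal", "IV": "normal"}

for name, pattern in [("patient 1", patient_1),
                      ("patient 3", patient_3),
                      ("control", control)]:
    print(name, "->", classify(pattern))
```

In practice a control probe for an unrelated gene, such as the AI apolipoprotein probe used above, is still needed to rule out a failed hybridisation before an absent band is read as a deletion.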
These results suggest that diagnosis of
haemophiliacs and the heterozygous (carrier)
females would be possible in families and this is
now under examination. The altered pattern seen
in the patient's DNA, whether absence of a band
or the presence of a band in an abnormal position,
serves as a "disease marker", whose presence or
absence can be assessed in a
suspected carrier. This same test can be applied to
antenatal diagnosis, if DNA from foetal cells is
available from an amniocentesis. "Genetic
diagnosis" should considerably improve existing
methods of antenatal diagnosis based on the
assay of foetal factor IX protein levels, with the
added advantage that the test can be carried out
earlier in pregnancy. Genetic methods using
natural polymorphisms within the factor IX gene
as allelic markers should also make 100% carrier
detection a reality, thereby improving the existing
somewhat unsatisfactory methods where
probability values are offered to patients.

CLAIMS

1. Recombinant DNA which comprises a
cloning vehicle DNA sequence and a DNA
sequence foreign to the cloning vehicle, the
foreign sequence comprising substantially the
following 129-nucleotide sequence (read in rows
of 30 across the page):

ATGTAACATG TAACATTAAG AATGGCAGAT
GCGAGCAGTT TTGTAAAAAT AGTGCTGATA
ACAAGGTGGT TTGCTCCTGT ACTGAGGGAT
ATCGACTTGC AGAAAACCAG AAGTCCTGTG
AACCAGCAG
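
The stated length of this sequence can be confirmed by joining the rows as printed. The sketch below assumes nothing beyond the rows themselves; the window count at the end relates to the 15-nucleotide minimum used in the later DNA molecule claims.

```python
# Rows of the claimed sequence, copied as printed (spaces separate tens).
rows = [
    "ATGTAACATG TAACATTAAG AATGGCAGAT",
    "GCGAGCAGTT TTGTAAAAAT AGTGCTGATA",
    "ACAAGGTGGT TTGCTCCTGT ACTGAGGGAT",
    "ATCGACTTGC AGAAAACCAG AAGTCCTGTG",
    "AACCAGCAG",
]

sequence = "".join(rows).replace(" ", "")

# Only the four DNA bases should be present.
unexpected = set(sequence) - set("ACGT")
print("unexpected characters:", sorted(unexpected))
print("length:", len(sequence))

# Number of distinct start positions for a 15-nucleotide window.
windows_15 = len(sequence) - 15 + 1
print("15-nucleotide windows:", windows_15)
```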

2. Recombinant DNA which comprises a
cloning vehicle DNA sequence and a DNA
sequence foreign to the cloning vehicle, the
foreign sequence comprising substantially the
following 203-nucleotide sequence (read in rows
of 30 across the page):

TGCCATTTCC ATGTGGAAGA GTTTCTGTTT
CACAAACTTC TAAGCTCACC CGTGCTGAGG
CTGTTTTTCC TGATGTGGAC TATGTAAATT
CTACTGAAGC TGAAACCATT TTGGATAACA
TCACTCAAAG CACCCAATCA TTTAATGACT
TCACTCGGGT TGTTGGTGGA GAAGATGCCA
AACCAGGTCA ATTCCCTTGG CAG

3. Recombinant DNA which comprises a
cloning vehicle DNA sequence and a sequence
foreign to the cloning vehicle, the foreign
sequence being substantially the same as a
sequence occurring in the human factor IX
genome.

4. Recombinant DNA according to Claim 3
wherein the human factor IX sequence has a
length of at least 50 nucleotides.

5. Recombinant DNA according to Claim 3
wherein the length of the human factor IX
sequence is from 75 to 27,000 nucleotides.

6. Recombinant DNA which comprises a
cloning vehicle sequence and a DNA sequence
foreign to the cloning vehicle, wherein the foreign
sequence includes substantially the whole of an
exon sequence of the human factor IX genome.

7. Recombinant DNA which comprises a
cloning vehicle sequence and a DNA sequence
foreign to the cloning vehicle, wherein the foreign
sequence comprises a DNA sequence which is
complementary to the human factor IX mRNA.

8. Recombinant DNA according to Claim 3, 4 or
5, wherein the cloning vehicle is a modified
pAT153 plasmid prepared by ligating a BamHI
and HindIII double digest of pAT153 to a pair of
complementary double sticky-ended
oligonucleotides having a DNA sequence
providing a BamHI restriction residue at one end, a
HindIII restriction residue at the other end and a
PvuII restriction site in between.

9. Recombinant DNA according to Claim 8
wherein the pair of complementary
oligonucleotides are of formula:

5' GATCCAGCTGA 3'
3'     GTCGACTTCGA 5'
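
The pairing of the two oligonucleotides can be verified with a short sketch. It assumes 4-nucleotide 5' overhangs at both ends, as produced by BamHI and HindIII, and uses the generally known PvuII recognition sequence CAGCTG; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


top = "GATCCAGCTGA"           # upper strand, written 5' to 3'
bottom_3to5 = "GTCGACTTCGA"   # lower strand as printed, 3' to 5'
bottom = bottom_3to5[::-1]    # lower strand rewritten 5' to 3'

OVERHANG = 4  # assumed length of each single-stranded sticky end

top_overhang = top[:OVERHANG]        # BamHI-compatible end
bottom_overhang = bottom[:OVERHANG]  # HindIII-compatible end
duplex_top = top[OVERHANG:]
duplex_bottom = bottom[OVERHANG:]

strands_pair = reverse_complement(duplex_bottom) == duplex_top
has_pvuii_site = "CAGCTG" in top

print("top overhang:", top_overhang)
print("bottom overhang:", bottom_overhang)
print("duplex pairs correctly:", strands_pair)
print("PvuII site present:", has_pvuii_site)
```

The 7-nucleotide duplex region pairs exactly, leaving GATC and AGCT as the single-stranded ends, with the PvuII site lying between them.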

10. Recombinant DNA which comprises a
cloning vehicle sequence and a DNA sequence
foreign thereto which hybridises to a 247 base-pair
sequence of bovine factor IX DNA
complementary to messenger RNA and indicated
in Figure 5 by the arrows at each end thereof.

11. A host transformed with at least one
molecule per cell of recombinant DNA claimed in
any preceding claim.

12. A host according to Claim 11 in the form of
E. coli.

13. A host according to Claim 11 in the form of
mammalian tissue cells.

14. A process of preparing a host transformed
with recombinant DNA as claimed in any one of
Claims 1 to 7, which process comprises:

(1) synthesising an oligodeoxynucleotide probe
having a nucleotide sequence comprising that
occurring in bovine factor IX messenger RNA
coding for amino acids 70-75 or 348-352 of
bovine factor IX and labelling the
oligodeoxynucleotide to form a probe;

(2) preparing complementary DNA to a mixture of
bovine RNA;

(3) inserting the complementary DNA in a cloning
vehicle to form a mixture of recombinant bovine
cDNAs;

(4) transforming a host with said mixture of
recombinant bovine cDNAs to form a library of
clones and multiplying said clones;

(5) probing the clones with the synthetic
oligodeoxynucleotide probe obtained in step 1 and
isolating a resultant recombinant bovine factor IX
cDNA-containing clone;

(6) digesting the recombinant bovine factor IX
cDNA from said clone with one or more enzymes
to produce a bovine factor IX cDNA molecule
containing a shorter sequence of bovine factor IX
DNA; and

(7) probing a library of recombinant human
genomic DNA in a transformed host with the
shorter sequence bovine factor IX cDNA molecule,
to hybridise the human genomic DNA to the said
recombinant bovine factor IX DNA and isolating
the resultant recombinant DNA-transformed host.
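
Step (1) relies on short peptide stretches that can be encoded by only a few different mRNA sequences. The sketch below counts those possibilities for the two bovine peptides given in the drawing earlier in this specification (amino acids 70-75, Glu-Cys-Trp-Cys-Gln-Ala, and 348-352, His-Met-Phe-Cys-Ala). It uses codon counts from the standard genetic code; the function name degeneracy is illustrative.

```python
# Number of codons per amino acid in the standard genetic code
# (only the residues needed for the two peptides are listed).
CODON_COUNTS = {
    "Glu": 2, "Cys": 2, "Trp": 1, "Gln": 2,
    "Ala": 4, "His": 2, "Met": 1, "Phe": 2,
}


def degeneracy(peptide):
    """Count the distinct mRNA sequences that could encode `peptide`.

    `peptide` is a hyphen-separated string of three-letter codes.
    """
    count = 1
    for residue in peptide.split("-"):
        count *= CODON_COUNTS[residue]
    return count


peptide_70_75 = "Glu-Cys-Trp-Cys-Gln-Ala"
peptide_348_352 = "His-Met-Phe-Cys-Ala"

print(peptide_70_75, "->", degeneracy(peptide_70_75), "coding sequences")
print(peptide_348_352, "->", degeneracy(peptide_348_352), "coding sequences")
```

Residues with one or two codons, such as Trp, Met and Cys, keep the number of possible coding sequences small, which is one plausible reason for choosing these regions for synthetic probes.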

15. A process of preparing a host transformed
with recombinant DNA as claimed in Claim 1, 2 or
7, which process comprises probing a library of
clones containing recombinant DNA
complementary to human mRNA with a probe
comprising a labelled DNA comprising a sequence
complementary to part or all of an exon region of
the human factor IX genome.

16. A DNA molecule comprising an at least 15
nucleotide long sequence of part or all of
substantially the 129-nucleotide sequence set
forth in Claim 1.

17. A DNA molecule comprising an at least 15
nucleotide long sequence of part or all of
substantially the 203-nucleotide sequence set
forth in Claim 2.

18. A DNA molecule comprising an at least 15
nucleotide long sequence of part only of the DNA
sequence of the human factor IX genome.

19. A DNA molecule comprising a sequence of
length at least 15 nucleotides substantially the
same as a sequence complementary to part or all
of that occurring in human factor IX mRNA.

20. A DNA molecule according to any one of
Claims 16 to 19 of length at least 50 nucleotides.

21. An artificial DNA molecule comprising a
sequence substantially the same as a sequence of
length at least 15 nucleotides occurring in the
human factor IX genome.

22. An artificial DNA molecule according to
Claim 21 comprising substantially only exon
sequences.

23. A labelled diagnostic probe comprising a
DNA molecule having a single-stranded or double-stranded
probe sequence of from 15 to 10,000
nucleotides long of DNA sequence defined in
Claim 16, 17, 18 or 19 or its complementary
sequence.

24. A probe according to Claim 23 having a
probe sequence from 20 to 5,000 nucleotides
long.